Unity in Diversity: Discovering Topics from Words - Information Theoretic Co-clustering for Visual Categorization

Authors

  • Ashish Gupta
  • Richard Bowden
Abstract

This paper presents a novel approach to learning a codebook for visual categorization that resolves the key issue of intra-category appearance variation found in complex real-world datasets. The codebook of visual-topics (semantically equivalent descriptors) is built by grouping visual-words (syntactically equivalent descriptors) that are scattered in feature space. We analyze the joint distribution of images and visual-words using information theoretic co-clustering to discover visual-topics. Our approach is compared with the standard 'Bag-of-Words' approach. The statistically significant performance improvement on all the datasets used (Pascal VOC 2006, VOC 2007, VOC 2010, and Scene-15) establishes the efficacy of our approach.
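The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea, assuming the usual information-theoretic co-clustering criterion: a grouping of images and visual-words is good when it preserves most of the mutual information in their joint distribution. The count matrix, the group labels, and the helper names `mutual_information` and `aggregate` are all hypothetical and chosen for illustration; this is not the authors' implementation.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information (in nats) of a 2-D table of joint counts or probabilities."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()                 # normalise to a joint distribution
    px = joint.sum(axis=1, keepdims=True)       # row (image) marginal
    py = joint.sum(axis=0, keepdims=True)       # column (visual-word) marginal
    nz = joint > 0                              # skip zero cells: 0 * log 0 = 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

def aggregate(joint, row_labels, col_labels):
    """Sum a joint table into (row-group x column-group) blocks."""
    joint = np.asarray(joint, dtype=float)
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    rows_onehot = np.eye(row_labels.max() + 1)[row_labels]   # (n_rows, n_row_groups)
    cols_onehot = np.eye(col_labels.max() + 1)[col_labels]   # (n_cols, n_col_groups)
    return rows_onehot.T @ joint @ cols_onehot

# Hypothetical image-by-visual-word co-occurrence counts (4 images, 4 words).
counts = np.array([
    [8, 7, 0, 1],
    [9, 6, 1, 0],
    [0, 1, 7, 9],
    [1, 0, 8, 6],
])

# Hypothetical assignments: images to image groups, visual-words to visual-topics.
image_groups = np.array([0, 0, 1, 1])
word_topics = np.array([0, 0, 1, 1])

mi_full = mutual_information(counts)
mi_topics = mutual_information(aggregate(counts, image_groups, word_topics))
loss = mi_full - mi_topics   # information lost by replacing words with topics

# A grouping that mixes unrelated words and images, for comparison.
mixed = np.array([0, 1, 0, 1])
mi_mixed = mutual_information(aggregate(counts, mixed, mixed))

print(f"I(images; words)       = {mi_full:.4f} nats")
print(f"I(groups; topics)      = {mi_topics:.4f} nats (loss {loss:.4f})")
print(f"I(mixed groups/topics) = {mi_mixed:.4f} nats")
```

A full co-clustering procedure would search over the assignments to keep this loss small; the sketch only evaluates the criterion for fixed assignments.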

Similar resources

Discipline Hotspots Mining Based on Hierarchical Dirichlet Topic Clustering and Co-word Network

Discovering inherent correlations and hot research topics among various disciplines from massive collections of scientific documents is very important for understanding research trends. The LDA (Latent Dirichlet Allocation) topic model can find topics in big data sets, but the number of topics must be specified before topic clustering. There is a lot of randomness to determine the number of topic...

A Comparative Study of Spectral Clustering and Information-theoretic Co-clustering for Video Shot Categorization

Automatic categorization of video shots is important in video indexing and retrieval. To improve the effectiveness of video shot categorization, current researchers have addressed two major issues: i) spatio-temporal coherence from shot to shot, and ii) bipartite correlation between descriptive features and shot categories. In recent works, spectral clustering and information-theoretic co-clust...

Discovering and Analyzing the Intellectual Structure and Its Evolution in Core Journals of "Knowledge and Information Science" during 2004-2013

Purpose: This study aims to reveal the intellectual structure of Knowledge and Information Science and its evolution, along with a review of the journals' subject scope, based on 6830 abstracts from the ten core journals in the JCR 2013 over ten years (2004-2013). Methodology: In this research, co-word and correspondence analysis of 150 words, selected by tf-idf weight, were done after parametri...

A Probabilistic Topic Model Based on Local Word Relations in Overlapping Windows

A probabilistic topic model assumes that documents are generated through a process involving topics and then tries to reverse this process: given the documents, it extracts the topics. A topic is usually assumed to be a distribution over words. LDA is one of the first and most popular topic models introduced so far. In the document generation process assumed by LDA, each document is a distribution o...

Contextual Document Clustering

In this paper we present a novel algorithm for document clustering. This approach is based on distributional clustering, where subject-related words that have a narrow context are identified to form metatags for that subject. These contextual words form the basis for creating thematic clusters of documents. We believe that this approach will be invaluable in creating an information retrieval ...

Publication date: 2012